refactor(gmail): replace hand-rolled email construction with mail-builder #491
malob wants to merge 2 commits into googleworkspace:main
…lder

Replace custom MessageBuilder, RFC 2047 encoding, header sanitization, and address encoding (including googleworkspace#482) with the mail-builder crate (Stalwart Labs, 0 runtime deps). Each command builds a mail_builder::MessageBuilder directly.

Introduce structured types throughout:
- Mailbox type (parsed display name + email) replaces raw string passing
- sanitize_control_chars strips ASCII control characters (CRLF, null, tab, etc.) at the parse boundary as defense-in-depth for mail-builder's structured header types, superseding sanitize_header_value, sanitize_component, and encode_address_header from googleworkspace#482
- OriginalMessage fields use Option<T> instead of empty-string sentinels
- parse_original_message returns Result with validation (threadId, From, Message-ID)
- Pre-parsed Config types (SendConfig, ForwardConfig, ReplyConfig) with Vec<Mailbox>: parse at the boundary, not downstream
- parse_forward_args and parse_send_args return Result with --to validation, consistent with parse_reply_args
- parse_optional_mailboxes helper normalizes Some(vec![]) to None for optional address fields (--cc, --bcc, --from)
- Envelope types borrow from Config + OriginalMessage with lifetimes
- Message IDs stored bare (no angle brackets), parsed once at boundary
- References stored as Vec<String> instead of space-separated string
- ThreadingHeaders bundles In-Reply-To + References with debug_assert for bare-ID convention
- Shared CLI arg builders (common_mail_args, common_reply_args) eliminate duplicated --cc/--bcc/--html/--dry-run definitions

Additional improvements:
- finalize_message returns Result instead of panicking via .expect()
- Mailbox::parse_list filters empty-email entries (trailing comma edge case)
- format_email_link percent-encodes mailto hrefs to prevent parameter injection
- Forward date handling: omits Date line when absent instead of showing empty "Date: "
- Dry-run auth: log skipped auth as diagnostic instead of silently discarding errors
- Restore --html tips in after_help strings (gmail_quote CSS, cid: image warnings, HTML fragment advice) lost in release PR googleworkspace#434
- Update execute_method call for upload_content_type parameter (googleworkspace#429)

Delete: MessageBuilder, encode_header_value, sanitize_header_value, encode_address_header, sanitize_component, extract_email, extract_display_name, split_mailbox_list, build_references.
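The parse-boundary idea in this commit message (a typed Mailbox, control-character stripping, and dropping empty entries from a list) can be illustrated with a minimal std-only sketch. This is not the PR's code: the struct fields, the parsing rules, and the function bodies are assumptions made for illustration, and the naive comma split does not handle commas inside quoted display names.

```rust
/// Hypothetical stand-in for the PR's `Mailbox` type (field names are assumed).
#[derive(Debug, Clone, PartialEq)]
pub struct Mailbox {
    pub name: Option<String>,
    pub email: String,
}

/// Drops every ASCII control character (CR, LF, NUL, tab, DEL, ...).
pub fn sanitize_control_chars(input: &str) -> String {
    input.chars().filter(|c| !c.is_ascii_control()).collect()
}

impl Mailbox {
    /// Parses `Name <addr>` or a bare `addr`, sanitizing both parts.
    pub fn parse(raw: &str) -> Mailbox {
        let raw = raw.trim();
        if let (Some(open), Some(close)) = (raw.rfind('<'), raw.rfind('>')) {
            if open < close {
                // Text before '<' is the display name; surrounding quotes are dropped.
                let name = sanitize_control_chars(raw[..open].trim().trim_matches('"'));
                let email = sanitize_control_chars(raw[open + 1..close].trim());
                let name = if name.is_empty() { None } else { Some(name) };
                return Mailbox { name, email };
            }
        }
        // No angle brackets: treat the whole input as a bare address.
        Mailbox { name: None, email: sanitize_control_chars(raw) }
    }

    /// Splits on commas and drops entries whose email is empty,
    /// which covers the trailing-comma edge case.
    pub fn parse_list(raw: &str) -> Vec<Mailbox> {
        raw.split(',')
            .map(Mailbox::parse)
            .filter(|m| !m.email.is_empty())
            .collect()
    }
}
```

In this sketch a value such as "a@example.com\r\nBcc: evil@example.com" keeps its text but loses the CR and LF, so it can no longer start a new header line.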
Consistent with +reply, +reply-all, and +forward which already support --from. Uses the same parse_optional_mailboxes path and apply_optional_headers plumbing.
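The normalization that parse_optional_mailboxes is described as performing (an empty list becomes None) might look roughly like the following. Only the function name comes from the PR; the signature, the simplified Mailbox, and the parse_list helper are assumptions for this sketch.

```rust
/// Simplified, hypothetical mailbox holding only an address.
#[derive(Debug, Clone, PartialEq)]
pub struct Mailbox {
    pub email: String,
}

/// Splits a comma-separated flag value and skips blank entries.
pub fn parse_list(raw: &str) -> Vec<Mailbox> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(|s| Mailbox { email: s.to_string() })
        .collect()
}

/// Returns `None` when the flag is absent or parses to an empty list,
/// so callers never have to distinguish `Some(vec![])` from `None`.
pub fn parse_optional_mailboxes(raw: Option<&str>) -> Option<Vec<Mailbox>> {
    raw.map(parse_list).filter(|list| !list.is_empty())
}
```

With this shape, optional flags such as --cc, --bcc, and --from have a single "not provided" representation downstream.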
🦋 Changeset detected
Latest commit: e0cd622
The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the email construction process within the Gmail helpers by integrating the …
Code Review
This is an excellent and comprehensive refactoring that significantly improves the Gmail helper functionality. Replacing the manual email construction with the mail-builder crate is a major step forward for correctness, maintainability, and security. The introduction of the Mailbox struct for typed address handling, along with the consistent use of Result for error propagation, makes the code much more robust and easier to follow. The attention to security, including the addition of control character sanitization and tests for header injection, is commendable. The new --from flag on the +send command is also a great addition for consistency. Overall, this is a very high-quality pull request.
…gration

Consolidates PRs #491, #513, #517, and #502 into a single rollup:
- Migrate message construction to mail-builder crate (RFC-compliant MIME)
- Add --from flag to +send for send-as alias support
- Add --attachment flag to +send with MIME auto-detection and path validation
- Add +read helper for extracting message body/headers (text, HTML, JSON)
- Serialize support for OriginalMessage and Mailbox types
- Display name quoting handled natively by mail-builder
…#526)

* refactor(gmail): replace hand-rolled email construction with mail-builder

  Replace custom MessageBuilder, RFC 2047 encoding, header sanitization, and address encoding (including #482) with the mail-builder crate (Stalwart Labs, 0 runtime deps). Each command builds a mail_builder::MessageBuilder directly.

  Introduce structured types throughout:
  - Mailbox type (parsed display name + email) replaces raw string passing
  - sanitize_control_chars strips ASCII control characters (CRLF, null, tab, etc.) at the parse boundary as defense-in-depth for mail-builder's structured header types, superseding sanitize_header_value, sanitize_component, and encode_address_header from #482
  - OriginalMessage fields use Option<T> instead of empty-string sentinels
  - parse_original_message returns Result with validation (threadId, From, Message-ID)
  - Pre-parsed Config types (SendConfig, ForwardConfig, ReplyConfig) with Vec<Mailbox>: parse at the boundary, not downstream
  - parse_forward_args and parse_send_args return Result with --to validation, consistent with parse_reply_args
  - parse_optional_mailboxes helper normalizes Some(vec![]) to None for optional address fields (--cc, --bcc, --from)
  - Envelope types borrow from Config + OriginalMessage with lifetimes
  - Message IDs stored bare (no angle brackets), parsed once at boundary
  - References stored as Vec<String> instead of space-separated string
  - ThreadingHeaders bundles In-Reply-To + References with debug_assert for bare-ID convention
  - Shared CLI arg builders (common_mail_args, common_reply_args) eliminate duplicated --cc/--bcc/--html/--dry-run definitions

  Additional improvements:
  - finalize_message returns Result instead of panicking via .expect()
  - Mailbox::parse_list filters empty-email entries (trailing comma edge case)
  - format_email_link percent-encodes mailto hrefs to prevent parameter injection
  - Forward date handling: omits Date line when absent instead of showing empty "Date: "
  - Dry-run auth: log skipped auth as diagnostic instead of silently discarding errors
  - Restore --html tips in after_help strings (gmail_quote CSS, cid: image warnings, HTML fragment advice) lost in release PR #434
  - Update execute_method call for upload_content_type parameter (#429)

  Delete: MessageBuilder, encode_header_value, sanitize_header_value, encode_address_header, sanitize_component, extract_email, extract_display_name, split_mailbox_list, build_references.

* feat(gmail): add --from flag to +send for send-as alias support

  Consistent with +reply, +reply-all, and +forward which already support --from. Uses the same parse_optional_mailboxes path and apply_optional_headers plumbing.

* fix: quote display names with RFC 2822 special characters in +reply

  When replying to emails from corporate senders with display names like "Anderson, Rich (CORP)" <email@adp.com>, the +reply command fails with "Invalid To header" (400) from the Gmail API.

  The root cause: encode_address_header() strips quotes from the display name via extract_display_name(), then reconstructs the address without re-quoting. When the display name contains RFC 2822 special characters (commas, parentheses), the unquoted form is ambiguous: commas split it into multiple malformed mailboxes and parentheses are interpreted as RFC 2822 comments.

  Fix: re-quote the display name when it contains any RFC 2822 special characters, using a single-pass character iterator that preserves already-escaped sequences and escapes bare quotes/backslashes.

  Fixes #512

* feat(gmail): add --attachment flag, +read helper, and mail-builder migration

  Consolidates PRs #491, #513, #517, and #502 into a single rollup:
  - Migrate message construction to mail-builder crate (RFC-compliant MIME)
  - Add --from flag to +send for send-as alias support
  - Add --attachment flag to +send with MIME auto-detection and path validation
  - Add +read helper for extracting message body/headers (text, HTML, JSON)
  - Serialize support for OriginalMessage and Mailbox types
  - Display name quoting handled natively by mail-builder

* chore: regenerate skills [skip ci]

* fix: use validate_safe_file_path for attachment path validation

  Addresses Gemini review: validate_safe_dir_path hardcodes '--dir' in error messages. validate_safe_file_path accepts the flag name, so errors now correctly reference '--attachment'.

* refactor: make OriginalMessage.thread_id optional

  The Gmail API does not guarantee threadId on all message resources (e.g. drafts). Making it Option<String> prevents parse failures on valid messages and avoids requiring thread_id in helpers like +read that don't use it.

* fix: use canonicalized path for attachment file operations (TOCTOU)

  validate_safe_file_path returns a canonicalized PathBuf. Use it for exists/is_file checks and downstream file reads instead of the original un-resolved path to prevent time-of-check/time-of-use races.

* feat(gmail): add --attach flag for file attachments

  Add -a/--attach to +send, +reply, +reply-all, and +forward. Can be specified multiple times for multiple attachments. MIME type is auto-detected via mime_guess2. Closes #247.

  Send via the Gmail API upload endpoint (multipart/related with message/rfc822 media type) instead of base64-encoding into a JSON raw field. This raises the size limit from ~5MB (metadata-only endpoint) to 35MB (upload endpoint, per discovery document).

  Introduce UploadSource enum in the executor to consolidate upload_path, upload_content_type, and upload_bytes into a single type-safe parameter. File and Bytes variants make the two upload strategies (from disk vs. from memory) mutually exclusive by construction.

  Validates attachment paths (control characters, regular file, non-empty) and total size (25MB raw limit, accounting for base64 expansion of attachments within the MIME message against the 35MB API limit). Size check uses actual bytes read to avoid TOCTOU race.

* chore: update changeset and fix integration with malob's attachment impl

  Update changeset to reflect combined work. Fix thread_id type mismatches in new tests from cherry-pick. Fix upload_path scope in main.rs. Make reject_control_chars pub(crate) for attachment validation.

  Co-authored-by: Malo Bourgon <mbourgon@gmail.com>

* chore: regenerate skills [skip ci]

* fix: restore MIME sanitization and terminal escape protection in executor

  Restore two security features accidentally lost during the UploadSource refactor:
  1. resolve_upload_mime: restructure from early-returns to collect-then-sanitize pattern; strips control chars from user-supplied MIME types to prevent CRLF header injection.
  2. Model Armor error path: restore sanitize_for_terminal on error messages to prevent terminal escape sequence injection from API responses.

  Co-authored-by: Malo Bourgon <mbourgon@gmail.com>

* chore: remove duplicate changeset from cherry-pick

  gmail-attach-flag.md duplicated content already in gmail-helpers-rollup.md. Both were marked minor, which would cause a double version bump.

* fix: add path traversal protection to attachment validation

  Replace reject_control_chars with validate_safe_file_path in parse_attachments. All file operations (metadata, read, filename extraction, MIME detection) now use the canonicalized path, preventing path traversal attacks (e.g. ../../.ssh/id_rsa) and closing TOCTOU gaps.

  Update tests to use CWD-relative temp directories (tempdir_in(".")) since validate_safe_file_path rejects paths outside the working directory.

  Co-authored-by: Malo Bourgon <mbourgon@gmail.com>

* refactor: deduplicate terminal sanitizer in read.rs

  Replace the local sanitize_terminal_output function with the existing crate::error::sanitize_for_terminal via import alias. This eliminates code duplication and provides consistent sanitization across the codebase. The crate-wide sanitizer also correctly strips CR (carriage return) which can be abused for terminal overwrite attacks.

---------

Co-authored-by: Malo Bourgon <mbourgon@gmail.com>
Co-authored-by: Rich Anderson <richanderson00@gmail.com>
Co-authored-by: jpoehnelt-bot <jpoehnelt-bot@users.noreply.github.com>
Co-authored-by: googleworkspace-bot <googleworkspace-bot@users.noreply.github.com>
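The UploadSource and size-limit reasoning from the --attach commit above can be sketched as follows. Only the enum name and its File and Bytes variants come from the commit message; the fields, constants, function names, the reading of "MB" as MiB, and the 76-column line wrapping are assumptions used to make the arithmetic concrete.

```rust
use std::path::PathBuf;

/// Hypothetical shape: the source names only the `File` and `Bytes` variants.
/// A value is exactly one of the two, so "from disk" and "from memory"
/// cannot both be set at once.
#[derive(Debug)]
pub enum UploadSource {
    File { path: PathBuf, content_type: Option<String> },
    Bytes { data: Vec<u8>, content_type: String },
}

impl UploadSource {
    /// Short human-readable description, e.g. for dry-run diagnostics.
    pub fn describe(&self) -> String {
        match self {
            UploadSource::File { path, .. } => format!("file {}", path.display()),
            UploadSource::Bytes { data, .. } => format!("{} in-memory bytes", data.len()),
        }
    }
}

/// Assumed interpretation of the "25MB raw limit" and "35MB API limit".
pub const MAX_RAW_ATTACHMENT_BYTES: u64 = 25 * 1024 * 1024;
pub const API_UPLOAD_LIMIT_BYTES: u64 = 35 * 1024 * 1024;

/// Size of `raw` bytes after base64 encoding, plus a CRLF per 76-char line.
pub fn mime_base64_len(raw: u64) -> u64 {
    let encoded = (raw + 2) / 3 * 4; // 4 output chars per 3 input bytes, padded
    let line_breaks = (encoded + 75) / 76 * 2; // one CRLF per (partial) line
    encoded + line_breaks
}

/// Sums the sizes actually read and rejects totals above the raw limit.
pub fn check_total_size(sizes: &[u64]) -> Result<u64, String> {
    let total: u64 = sizes.iter().sum();
    if total > MAX_RAW_ATTACHMENT_BYTES {
        Err(format!(
            "attachments total {} bytes, limit is {}",
            total, MAX_RAW_ATTACHMENT_BYTES
        ))
    } else {
        Ok(total)
    }
}
```

Under these assumptions, the base64-expanded size of a full 25 MiB of attachments stays below the 35 MiB upload limit, which is consistent with the commit's choice of a lower raw limit.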
Superseded by #526
Description
Replace hand-rolled MessageBuilder, RFC 2047 encoding, and header sanitization with the mail-builder crate (Stalwart Labs, 24KB, 1 optional runtime dep), and add a --from flag to +send for send-as alias consistency with +reply, +reply-all, and +forward.

Why mail-builder
The Gmail helpers have grown from simple text emails to supporting HTML mode, CC/BCC, reply-all with recipient dedup, forwarding with threading, and send-as aliases. Each feature added more hand-rolled RFC 5322 logic: MIME headers, content-type selection, RFC 2047 encoding for non-ASCII names (#482), header injection prevention, and address parsing/formatting.
Adding attachment support (#247) would require multipart/mixed MIME construction — boundaries, part encoding, content-disposition headers — on top of an already complex custom implementation.
mail-builder gives us correct RFC 2047 encoding, Content-Transfer-Encoding, and MIME structure for free, plus a clean path to --attach via native multipart support.

Commit 1: Refactor

Types: Mailbox (parsed display name + email) replaces raw string passing. OriginalMessage fields use Option<T> instead of empty-string sentinels. Config types use Vec<Mailbox>. Message IDs are stored bare (no angle brackets), parsed once at the boundary.

Message construction: Each command builds a mail_builder::MessageBuilder directly via shared helpers (apply_optional_headers, set_threading_headers, finalize_message). The pattern is consistent across all three commands.

Security: sanitize_control_chars in Mailbox::parse strips ASCII control characters (CRLF, null, tab) at the parse boundary. This supersedes sanitize_header_value, sanitize_component, and encode_address_header from #482: mail-builder's structured address types prevent header injection structurally, and parse-boundary sanitization provides defense-in-depth. End-to-end injection tests verify that CRLF in --from/--cc cannot create spurious headers.

Behavioral changes:
- parse_original_message now returns Result: it rejects messages missing required headers (threadId, From, Message-ID) instead of silently proceeding with empty fields

Commit 2: --from on +send

Adds the --from flag to +send, consistent with +reply, +reply-all, and +forward. Uses the same parse_optional_mailboxes path and apply_optional_headers plumbing.

Note on #482
This PR supersedes the RFC 2047 address header encoding merged in #482. mail-builder handles RFC 2047 automatically via structured Address types, and sanitize_control_chars in Mailbox::parse strips all ASCII control characters at the parse boundary (covering the same CRLF, null, and tab injection vectors as sanitize_component/sanitize_header_value). One behavioral difference: #482's encode_address_header also truncated bare emails at the first non-email character as post-CRLF-stripping cleanup; we rely on mail-builder's angle-bracket wrapping and Gmail's API validation to reject malformed addresses instead of silently truncating. See test_mailbox_parse_strips_* and test_send_crlf_injection_*.

Note on #395
The open attachments PR hand-rolls multipart MIME on the old MessageBuilder. This refactor provides a cleaner foundation for that feature using mail-builder's native multipart support.
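The bare-ID convention described under Types (Message IDs without angle brackets, References as a list) can be sketched in plain Rust. ThreadingHeaders is named in the PR, but its fields and every function below are assumptions for illustration; in the PR the header rendering is delegated to mail-builder rather than done by hand.

```rust
/// Strips surrounding whitespace and angle brackets once, at the boundary.
pub fn parse_message_id(raw: &str) -> Option<String> {
    let id = raw
        .trim()
        .trim_start_matches('<')
        .trim_end_matches('>')
        .trim();
    if id.is_empty() { None } else { Some(id.to_string()) }
}

/// Splits a raw References header value into bare IDs.
pub fn parse_references(raw: &str) -> Vec<String> {
    raw.split_whitespace().filter_map(parse_message_id).collect()
}

/// Hypothetical layout bundling In-Reply-To and References for a reply.
#[derive(Debug, PartialEq)]
pub struct ThreadingHeaders {
    pub in_reply_to: String,
    pub references: Vec<String>,
}

impl ThreadingHeaders {
    /// A reply's References is the original's References plus the original's ID.
    pub fn for_reply(original_id: &str, original_refs: &[String]) -> Self {
        // Catches callers that pass an already-bracketed ID in debug builds.
        debug_assert!(!original_id.contains('<') && !original_id.contains('>'));
        let mut references = original_refs.to_vec();
        references.push(original_id.to_string());
        Self { in_reply_to: original_id.to_string(), references }
    }

    /// Re-adds angle brackets only when rendering the header value.
    pub fn references_header(&self) -> String {
        self.references
            .iter()
            .map(|id| format!("<{}>", id))
            .collect::<Vec<_>>()
            .join(" ")
    }
}
```

Keeping IDs bare internally means brackets are added in exactly one place, so a double-bracketed value such as <<id>> cannot arise from mixing conventions.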
Checklist:
- … AGENTS.md guidelines (no generated google-* crates).
- … cargo fmt --all to format the code perfectly.
- … cargo clippy -- -D warnings and resolved all warnings.
- … pnpx changeset) to document my changes.